
# **Fusion of Interferometric SAR and Photogrammetric Elevation Data**

Loris Copa, Daniela Poli and Fabio Remondino

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57584

# **1. Introduction**

Digital Elevation Models (DEMs) are digital representations of the topography of a geographic area, based on a large number of points whose X, Y and Z coordinates describe the Earth's three-dimensional shape. The term DEM is generic and covers two distinct products: Digital Terrain Models (DTMs), describing the bare ground of the imaged area, and Digital Surface Models (DSMs), describing the surface including the elevation of above-ground objects. DEMs may be arranged either in a regular raster or as an irregular point cloud. These products have become a major component of a large number of remote sensing and environmental applications, such as mapping, orthoimage generation, virtual terrain reconstruction and visualization, simulations, civil engineering, land monitoring, forestry, flood planning, erosion control, agriculture, environmental planning, archaeology and others. Different techniques, based on Earth imaging products belonging to different families, are commonly used to obtain the elevation information. The most common techniques are based on optical imagery, LiDAR (Light Detection and Ranging) and Synthetic Aperture Radar (SAR). In this chapter the main focus will be on techniques exploiting optical and SAR sensors: InSAR (Interferometric SAR) and photogrammetry. The generic term DEM will be used, even though the data recorded with optical sensors usually refers to above-ground objects (DSM), while that obtained with SAR is usually a DSM/DTM composite. An overview and some results will be shown, outlining the main strengths and weaknesses of the approaches, thus leading to the fusion hypothesis, which aims to exploit the inter-platform complementarity and the intrinsic advantages of both techniques. Two different product levels (raster and point cloud) – on which the fusion approaches have to be built – are also discussed, as well as examples of such approaches.

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

# **2. DEM generation from Synthetic Aperture Radar sensors**

The generation of digital elevation models has been among the first techniques exploiting the phase information of SAR data. Being an active sensing technology, SAR offers day and night operability. However, one of the major drawbacks of this technology is posed by the temporal decorrelation between passes of SAR data. This drawback has been addressed mainly in two different ways. The first one is the minimization or suppression of the temporal separation between two or more acquisitions exploited in the DEM generation process. The second way relies on the selection of different frequencies/polarizations.

The different approaches available for spaceborne platforms, as well as the basic interferometric framework, are introduced in the following sections.

### **2.1. SAR interferometry**

Interferometry is among the main techniques applied to derive elevation data from SAR images by exploiting their phase information, while radargrammetry [1], developed in the 1980s, exploits their intensity information (see the radargrammetry chapter for further information). In this section, a brief description of the InSAR technique is given, together with an example processing chain to generate InSAR DEMs. A more in-depth overview of SAR interferometry can be found in [2].

SAR sensors belong to the so-called "active" family of instruments: they emit a wave which rebounds on the target and returns to the satellite, where it is recorded. The wavelength of the outgoing signal is known, and consequently so is its phase. This makes it possible to compare the return phase with the reference one, allowing an estimate of their coherence. Ideally, the return phase would be a function of the satellite-target round-trip distance alone, yielding a phase difference which can be measured directly. In reality, however, several other factors affect the phase value, i.e. systematic effects which have to be considered in order to obtain useful information.

In interferometry, two images covering the same area, acquired from the same or slightly different positions, are used to obtain the topographic information. The phase difference between the two acquisitions is used to produce the interferogram. The difference is measured in radians and is recorded in fringes, each one representing a 2π cycle. The signal path length to the ground and back consists of a number of whole wavelengths plus some fraction of a wavelength; the return wave phase is therefore a function of the distance between the target and the emitter/receiver. This fraction can be seen as a phase shift in the returning wave. The total number of whole wavelengths is unknown, as is the total distance to the satellite, but the phase shift defined by the extra fraction of a wavelength can be measured by the receiver with high accuracy.
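As a minimal numerical sketch of this fringe cycle (a hedged illustration with generic values, not figures from the chapter), the wrapped interferometric phase for a given two-way slant-range difference can be computed as:

```python
import numpy as np

# Sketch: the interferometric phase is the wrapped two-way path difference
# expressed in radians.  For a C-band sensor (wavelength ~5.6 cm), a
# slant-range change of half a wavelength produces one full 2*pi fringe.
wavelength = 0.056          # metres (C-band, approximate, illustrative)

def interferometric_phase(delta_r):
    """Wrapped phase for a one-way slant-range difference delta_r (metres)."""
    phase = 4.0 * np.pi * delta_r / wavelength    # two-way travel: factor 4*pi
    return float(np.angle(np.exp(1j * phase)))    # wrap into (-pi, pi]

# A range change of half a wavelength completes one full fringe,
# so the wrapped phase returns to (numerically) zero:
print(interferometric_phase(wavelength / 2.0))
```

This also makes concrete why the whole-wavelength count is unobservable: any integer number of full cycles maps to the same wrapped value.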

The measured phase is influenced by different factors throughout the sensor-target-sensor path. One of the most important is the interaction with the ground surface. Reflection, depending on the physical properties of the target material, may cause phase changes. As a consequence, the return signal for every pixel is the result of the summation of the sub-pixel interactions between the signal and a multitude of smaller targets included in the imaged ground area. These targets may show different physical and topographic properties, e.g. different dielectric properties, a different satellite-target distance or varying surface roughness. This leads to a signal which can be uncorrelated with those of the surrounding pixels and which is defined as "speckle". Orbital effects, which are other important contributors to signal noise, should be considered as well. Their importance derives from geometrical constraints, since images from satellite platforms with different orbits cannot be compared. The spatial positions from which the images are acquired should be as close as possible and the data comparable; hence they must come from the same orbital track of a single satellite platform. Their baseline, i.e. the perpendicular distance between the two acquisition positions, is known to within a few centimeters and causes a regular phase difference in the interferogram, which is modelled and removed. The position offset also implies a difference in the topographic effects on phase between the images. The topographic height needed to produce a phase change of one fringe, known as the altitude of ambiguity, becomes smaller as the baseline becomes longer. This aspect can be exploited to obtain the topographic height of the target, allowing the generation of a DEM. The topographic phase contribution can be quantified and subtracted from the interferogram by means of a previously available coarser DEM in conjunction with the orbital information.
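The dependence of the altitude of ambiguity on the baseline can be sketched with the standard repeat-pass approximation h_amb = λ·R·sin θ / (2·B⊥); the geometry values below are illustrative, ERS-like assumptions, not figures from the chapter:

```python
import math

# Hedged sketch: the altitude of ambiguity is the height difference that
# produces one full interferometric fringe.  Repeat-pass approximation:
#   h_amb = wavelength * slant_range * sin(incidence) / (2 * B_perp)
# The geometry below (C-band, ~850 km slant range, 23 deg incidence) is
# ERS-like but illustrative, not taken from the chapter.
def altitude_of_ambiguity(wavelength, slant_range, incidence_deg, b_perp):
    return (wavelength * slant_range
            * math.sin(math.radians(incidence_deg)) / (2.0 * b_perp))

h_short = altitude_of_ambiguity(0.056, 850e3, 23.0, 100.0)   # short baseline
h_long = altitude_of_ambiguity(0.056, 850e3, 23.0, 300.0)    # longer baseline
# A longer perpendicular baseline yields a smaller altitude of ambiguity,
# i.e. more fringes per unit height and higher topographic sensitivity.
print(round(h_short, 1), round(h_long, 1))   # roughly 93 m vs 31 m
```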

Once these contributions have been identified and removed, the final interferogram contains the residual signal, along with some remaining phase noise. The residual signal represents the phase change caused by the varying distance of the ground pixel from the satellite. In the interferogram, one phase difference fringe is recorded when the ground motion corresponds to half the radar wavelength, which in the two-way travel distance tallies with an increase of a whole wavelength. The absolute movement of a point can be obtained either by means of Ground Control Points (GCPs) recorded with GPS or similar instruments, or by assuming that an area in the interferogram did not experience any deformation.

In the following paragraphs, an example processing chain used to produce interferograms and subsequently DEMs is outlined. First, the two-pass SAR images have to be focused with a phase-preserving SAR focusing algorithm. Subsequently, the images have to be coregistered to quantify their offset and geometric difference, thus ensuring their inter-comparability. A coarse and fine coregistration procedure exploiting a coarser DEM is used to define the parameters for the inter-image point alignment. The parameters are iteratively refined, resampling the slave image to match the ground area of the master image. A common Doppler bandwidth filtering is an additional step to apply before the interferogram generation, in order to minimize the decorrelation which can be caused by an offset in the Doppler centroids.

The interferogram is then generated, a spectral shift filtering is performed and the Hermitian product is computed. The interferometric phase due to the curvature of the Earth is also removed by DEM flattening. In this process, synthetic fringes are generated either from the ellipsoidal height or from a coarser DEM exploiting a backward geocoding approach. These fringes are cross-multiplied with the SAR interferogram, allowing the removal of the low-frequency components of the wrapped phase. After the phase flattening, the complex interferogram is put through a filtering process to improve its signal-to-noise ratio, which in turn results in a higher height accuracy of the final product. The interferogram must then be unwrapped, extending it beyond the cyclic 0 to 2π phase interval to obtain a continuous phase field. This has to be carried out in order to solve the 2π ambiguity induced by the cyclicity of the phase values in the interferogram. A common approach is region growing, where a coherence threshold is applied in the growing process. If the images are characterized by large areas of low coherence, limiting the growing ability, a Minimum Cost Flow approach can be applied. This approach considers a regular square grid over the whole image; all the pixels whose coherence is lower than a user-defined threshold are masked out. A third approach is derived from the latter: here Delaunay polygons are used only on those areas exceeding the aforementioned coherence threshold. As a result, only the points showing good coherence are unwrapped, avoiding any influence from the less coherent ones. The use of the Delaunay triangular polygons also allows the exclusion of any small irregular areas showing poor coherence, thus avoiding the representation of phase jumps in the interferogram. Any residual unwrapping errors can then be corrected in a subsequent phase editing step, which can be either a semi-automatic or a fully manual procedure.
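The 2π ambiguity that unwrapping resolves can be illustrated in one dimension (a toy sketch; real interferograms require the 2D region-growing or Minimum Cost Flow methods described above):

```python
import numpy as np

# Toy illustration of the 2*pi ambiguity: a smooth "topographic" phase ramp
# is wrapped into (-pi, pi] and then recovered with a simple 1D unwrapper.
# numpy.unwrap integrates phase differences, assuming neighbouring samples
# differ by less than pi -- the 1D analogue of the coherence-guided growing
# used by real 2D unwrappers.
true_phase = np.linspace(0.0, 12.0 * np.pi, 200)      # several fringes
wrapped = np.angle(np.exp(1j * true_phase))           # wrap into (-pi, pi]
unwrapped = np.unwrap(wrapped)

# The unwrapped result matches the original up to a constant 2*pi multiple
offset = unwrapped[0] - true_phase[0]
print(np.allclose(unwrapped - offset, true_phase, atol=1e-8))
```

The constant offset at the end is why absolute referencing (GCPs, or a known stable area) is still needed after unwrapping.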

The following step aims at optimizing the geometric characteristics of the orbits, in order to improve the phase-to-height conversion and to ensure a precise geo-location. For this purpose, ground control points (GCPs) have to be provided, introducing absolute location information. Several ways to collect GCPs are possible. The most precise is "in-situ" collection with GPS, although it is also the most time-consuming. Another option is to collect the GCPs manually on the image itself, obtaining the height from a reference DEM. This process can also be accomplished automatically, with the software providing the user with a series of candidate pixels – ideally those best suited for the process. The synthetic phase, which was subtracted in the flattening process, is then added back to the unwrapped phase. This ensures that the final DEM is obtained only from the SAR data. Ultimately, the DEM is obtained in the phase-to-map conversion, in which the relation between the phase and the topographic height is exploited to obtain the elevation data composing the DEM. The final product is projected onto the desired cartographic reference system, taking into account all the geodetic and cartographic parameters needed. During this process the DEM precision can also be estimated: since phase and topography are proportional, the theoretical elevation standard deviation can be derived from the interferometric phase dispersion. The latter can be obtained from a relation based on the signal wavelength, and represents the dispersion of the estimates around an expected value. This last aspect is of major importance to quantify the quality of a DEM, and provides useful information for further processing steps such as mosaicking or, as will be explained in the following sections, DEM fusion. An example of a DEM along with its respective precision map is shown in Figure 1.
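The phase-to-height error propagation can be sketched as follows; the Cramér-Rao approximation of the phase dispersion from coherence, and all numeric values, are illustrative assumptions rather than the chapter's own formulas:

```python
import math

# Hedged sketch of phase-to-height error propagation: since height is
# proportional to phase, the elevation standard deviation scales the phase
# dispersion by the altitude of ambiguity:
#   sigma_h = (h_amb / (2*pi)) * sigma_phi
# sigma_phi is approximated with the Cramer-Rao bound from coherence (gamma)
# and the number of looks; formula choice and numbers are assumptions.
def phase_dispersion(gamma, n_looks):
    return math.sqrt((1.0 - gamma**2) / (2.0 * n_looks * gamma**2))

def height_std(h_amb, gamma, n_looks):
    return h_amb / (2.0 * math.pi) * phase_dispersion(gamma, n_looks)

# With a 50 m altitude of ambiguity and 10 looks:
print(round(height_std(50.0, 0.8, 10), 2))   # high coherence: around a metre
print(round(height_std(50.0, 0.3, 10), 2))   # low coherence: several metres
```

A per-pixel map of such values is exactly the kind of precision layer shown alongside the DEM in Figure 1, and is what later feeds the fusion weighting.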

**Figure 1.** Example of a TanDEM-X DEM (left) and the respective Precision map (right) over a region of Australia.

#### **2.2. Spaceborne InSAR DEM generation**

The ERS-1/2 (European Remote Sensing satellites 1 and 2) Tandem mission (1995, [3][2]) was the first to offer coverage of the same area by two instruments within one day. The two satellites shared the same orbital plane, satisfying the orbital requirements imposed by an interferometric process. Moreover, the one-day coverage interval is a very valuable characteristic for generating DEMs, since it lessens the temporal decorrelation issue. A nation-wide operational example of a DEM produced using SARscape® is given in Figure 2; see [4] for a technical description. The DEM covers the whole surface of Switzerland, approximately 42,000 km², with a 25 m horizontal resolution and height accuracies ranging from 7 to 15 metres, for moderate to steep topography respectively.

**Figure 2.** Example of a nation-wide ERS-1/2 DEM over Switzerland.

The decorrelation issue of spaceborne SAR data was then the main object of the Shuttle Radar Topography Mission (SRTM) [5]-[7]. The design of the instrument allowed the acquisition of space-based single-pass interferometric data, which is the best suited for InSAR DEM generation. The mission was flown on board the Space Shuttle in February 2000 and was designed to obtain elevation radar data on a near-global scale. The main objective was to generate the most complete high-resolution digital topographic database of planet Earth. The instrument's C-Band and X-Band radar sensors were used to cover most of Earth's land surface lying between latitudes 60 degrees North and 56 degrees South, providing coverage of some 80 per cent of Earth's land.

The second version (V2) of the SRTM-C digital topographic data, also known as the finished version, was released in recent years. This version resulted from an extensive editing effort led by the National Geospatial-Intelligence Agency (NGA). Some of its most valuable features are the well-defined water bodies and coastlines, as well as the absence of single-pixel errors such as spikes and wells. Note, however, that some areas of missing data (voids) can still be found. Moreover, the Version 2 directory also contains a vector mask defining the coastline. This vector layer was defined by NGA and is commonly known as the SRTM Water Body Data. The data is available at reproduction cost or as a free download, and offers a 3 arc-second (about 90 m) spatial resolution.

After the first SRTM data became publicly available, an important effort was invested in its accuracy validation. The accuracy assessment methodologies and tests were at first applied to limited areas; some examples can be found in [8]-[11]. Finally, the accuracy assessment of the whole dataset was accomplished as well [12]-[15]. The complementarity between the C- and X-Band products has also been studied [16][17].

An important and peculiar issue related to the generation of the final SRTM-C DEM was the filling of the voids plaguing the processed SAR data. The main areas affected were characterized by severe topography or very low backscatter, characteristics well known for their effect on SAR signal acquisitions. The void-filling problem has been covered in a number of publications, for example [18]-[23], in which the authors proposed approaches either based on pure spatial interpolation or also incorporating height information derived from other datasets. A first example of a completely filled SRTM dataset is the SRTM DEM Version 3, generated by CGIAR. A later version (Version 4) has also been released. In the latter, the voids have been further filled by including more auxiliary DEMs and SRTM30 data where available. Moreover, improved interpolation techniques [24] have been applied to enhance the product. These datasets are freely available online [25].

The SRTM dataset is definitely suitable as an input layer in a vast number of applications across different topics and disciplines. The major contribution offered by SRTM can be summarized as the availability of a dataset with very good height accuracy, even considering its spatial resolution, together with a good consistency of its characteristics over a very large area. This allows studies focusing on very large regions, where DEMs at higher resolution or height accuracy might also be obtained afterwards (following a preliminary study) or at the same time. The application fields in which SRTM data can be used are numerous and diverse: applications to geomorphology [26][27], volcanology [28], hydrological studies [22],[29]-[31], forestry research [32]-[34], archaeology [35], glacier volume monitoring [36], and many others have been proposed.

As discussed, SRTM data offers a large coverage, but at the expense of spatial resolution. In order to improve on the results obtained from the SRTM mission and to exploit the existing high- or very-high resolution SAR missions, research has focused on the identification of satellite SAR missions to complement the existing ones, with the main purpose of generating very-high-resolution DSMs. The main concept, introduced among others by [37][38], is the definition and assembly of satellite constellations composed of either passive or active SAR satellites orbiting around a master sensor. Single-pass capability is another main requirement that should be fulfilled. The concept was studied in [39]-[43], finally resulting in the TanDEM-X (TerraSAR-X add-on for Digital Elevation Measurement, [44]) mission, launched in 2010. The latter works in conjunction with the TerraSAR-X mission, providing a second satellite orbiting around the first one. This type of pairing provides a continuous combination of along- and across-track baselines, suitable respectively for current monitoring and for DEM generation. This satellite constellation will be used to produce the WorldDEM™, a DTED (Digital Terrain Elevation Data) level 3 DSM [46], which will be available in 2014. Some initial examples of such very-high resolution DEMs have been shown by [47] and [48]. One example of a TanDEM-X product is shown in Figure 3.

**Figure 3.** Example of a TanDEM-X DEM (right) compared to an SRTM-V4 DEM (left) over Australia.

A similar concept has been developed for the COSMO-SkyMed mission (see [45]), also completed in 2010 with the launch of the fourth satellite of the constellation. The missions considered in these studies rely on high-frequency (X-Band) sensors. A more in-depth example of a COSMO-SkyMed DEM compared to the SRTM-V4 equivalent is given in Figure 4.

Figure 4 also shows the subtraction of the two DEMs, i.e. the residuals in meters (mid-left image). This map and the corresponding normal distribution (mid-right image) show that on average the relative positioning of the two products is quite similar, with the majority of the observations falling within the −12 to 12 m range. However, the COSMO-SkyMed DEM has a much higher spatial resolution and altimetric precision than the SRTM data, allowing a better slope definition, as shown by the profiles at the bottom of Figure 4. In the SRTM data one can clearly see that the altimetric and geometric detail is lost. The better slope representation of COSMO-SkyMed also allows a better absolute radiometric correction of intensity data and, thanks to its higher three-dimensional spatial definition, the data is particularly suited to geo-information processing.

**Figure 4.** Comparison between an SRTM-V4 DEM (top-left) and a COSMO-SkyMed DEM (top-right). Height residuals in meters between the DEMs (mid-left) and corresponding residual distribution (mid-right). Elevation profiles over a test zone for SRTM-V4 (bottom-left) and COSMO-SkyMed (bottom-right), collected over the input DEMs as shown by the red line in the top images. Data collected over the Malawi region. Image courtesy of A.S.I. Agenzia Spaziale Italiana.

An alternative to these solutions, for applications that can exploit stable scattering mechanisms at lower frequencies, is ALOS PALSAR L-Band data. The lower frequency of the PALSAR system offers higher penetration in vegetated areas, resulting in a higher temporal correlation. This results in a lower phase noise compared to C-Band and X-Band systems [50]. In Figure 5, an example of a PALSAR DEM is given in comparison with its SRTM counterpart. A better slope definition is appreciable with ALOS-PALSAR as well, as shown by the profiles at the bottom of Figure 5.

**Figure 5.** Comparison between an SRTM-V4 DEM (top-left) and an ALOS-PALSAR DEM (top-right) over the Malawi region. Elevation profiles over a test zone for SRTM-V4 (bottom-left) and ALOS-PALSAR (bottom-right), collected over the input DEMs as shown by the red line in the top images.

# **3. DEM generation from spaceborne optical sensors**

#### **3.1. VHR optical sensors**

Today an increasing number of Earth-observation platforms are equipped with Very High-Resolution (VHR) optical sensors characterized by a ground resolution of less than 1 m [51], enabling the discrimination of fine details like buildings and individual trees. The radiometric and geometric quality of the satellite images is comparable with that of original digital aerial images. Image orientation has been simplified by the use of rational polynomial functions [52], and direct sensor orientation has improved to the point of allowing, in some cases, image processing and DEM generation without ground control information [53]-[54]. Complementing the high spatial resolution, VHR sensors are mounted on highly agile platforms, which enable rapid targeting, a revisit time of up to 1 day and stereoscopic coverage within the same orbit for 3D information recovery and DEM generation. The cost of a stereo acquisition (two stereo images) is in general twice that of a single acquisition. The availability of stereo acquisitions in the archives of image distributors is much lower than that of single acquisitions, but stereo acquisitions can be planned on demand.

#### **3.2. Image acquisition**

Earth observation optical sensors mounted on satellites acquire mainly in pushbroom mode. The imaging system of a pushbroom sensor consists of an optical instrument (usually a lens or a telescope) and Charge Coupled Device (CCD) lines assembled in the focal plane. While the platform moves along its trajectory, successive lines are acquired perpendicular to the satellite track and stored in sequence to form a strip. In the case of multispectral sensors, a strip is generated for each channel. In most cases the CCD lines are placed along a single line, or in two or more segments staggered along the flight direction (e.g. EROS-B) or butted with some overlap (e.g. QuickBird) to increase the image ground resolution through a specific geometric and radiometric post-processing.

In order to produce DEMs, stereoscopic coverage is mandatory. Amongst satellites using a stereo acquisition mode, the following can be distinguished: (1) standard across-track systems, (2) standard simultaneous multi-view along-track systems, and (3) agile single-lens systems.

In the standard *across-track configuration*, the CCD lines and one optical system are generally combined with a mirror that rotates from one side of the sensor to the other across the flight direction, up to 30°, providing overlap across the flight direction. In this configuration, stereo images are collected from different orbits at different dates, with temporal variation between the images. Examples are very popular high resolution (HR) satellite sensors like SPOT-1-4 HRV and SPOT-5 HRG by CNES, IRS-1C/1D PAN by ISRO, Beijing-1 CMT by the People's Republic of China and Kompsat-1 EOC by the Korea Aerospace Research Institute (KARI). In the standard *along-track configuration*, two or more strips are taken simultaneously from the same orbit at different angles along the flight direction. For each viewing direction there is one lens and one set of CCD lines placed on the focal plane. The along-track angles are generally fixed and the base-over-height (B/H) ratio is constant for each satellite. This same-date along-track configuration thus minimizes the temporal variation between the images. Examples of HR sensors with this configuration are SPOT-5 HRS by CNES, ALOS-PRISM by JAXA, Cartosat-1 by ISRO and ZY-3 by the People's Republic of China.

The last generation of VHR sensors uses an *agile configuration*, first introduced on IKONOS-2 in 1999. These sensors are carried on small and flexible satellites flying along sun-synchronous and quasi-polar orbits and use a single-lens optical system [59]. For stereo viewing or frequent temporal coverage, they have the ability to rotate on command around their camera axes and view the target from different directions, in forward mode (from North to South) or reverse mode (from South to North). Therefore, along-track and across-track stereo pairs of a particular area of interest can be planned in advance and acquired over almost any part of the Earth's surface. The general limit for off-nadir angles is 45°, but larger values are used in some specific situations. Some agile sensors have a synchronous acquisition mode: the satellite speed and the scanning speed are equal, and the viewing angle is constant during one image acquisition. Examples are IKONOS-2, Kompsat-2 and Formosat-2 RSI by the National Space Program Office of China (NSPO). On the other hand, more agile sensors, including QuickBird, WorldView-1 and WorldView-2 (DigitalGlobe), GeoEye-1 (GeoEye), EROS-A and -B (ImageSat International), Orbview-3 (Orbimage), TopSat (QinetiQ) and Pléiades 1A and 1B (Astrium), scan the Earth in asynchronous mode: the platform velocity is higher than the scanning velocity, therefore the satellite rotates continuously during the acquisition and each CCD line dwells longer on a line on the ground. The limitation of this scheme is that the geometry is less stable. The success of agile single-lens systems for the acquisition of VHR stereo imagery is confirmed by their planned use in future missions too. GeoEye-2, planned for launch in 2013, and WorldView-3, planned for 2014, will have the same stereo capability as their predecessors GeoEye-1 and WorldView-2 respectively.
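The practical effect of the base-over-height ratio on stereo DEM quality can be illustrated with a common rule of thumb, σ_Z ≈ (H/B)·σ_parallax; the function name and all numbers below are hypothetical illustrations, not figures from the chapter:

```python
# Rough rule of thumb for stereo height accuracy from the base-over-height
# (B/H) ratio: sigma_Z ~ (H/B) * sigma_parallax, where the parallax
# measurement error is expressed in ground units (matching accuracy in
# pixels times the ground sample distance).  Illustrative values only.
def stereo_height_std(b_over_h, gsd, matching_acc_px=0.5):
    return (1.0 / b_over_h) * matching_acc_px * gsd

# A 0.5 m GSD sensor with half-pixel matching accuracy:
print(round(stereo_height_std(0.6, 0.5), 2))   # B/H = 0.6, larger convergence
print(round(stereo_height_std(0.3, 0.5), 2))   # B/H = 0.3, weaker geometry
```

This is why a fixed, reasonably large B/H in along-track systems, or a well-planned convergence angle for agile sensors, matters for DEM generation.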

Images acquired by VHR sensors are distributed with a certain level of processing; unfortunately, the terminology used to denominate the same type of image data differs from one data provider to another. In general, three main processing levels can be distinguished: a) raw images with only normalization and calibration of the detectors, without any geometric correction (this level is preferred by photogrammetrists working with 3D physical models); b) geo-referenced images corrected for systematic distortions due to the sensor, the platform and the Earth's rotation and curvature; c) map-oriented images, also called geocoded images, corrected for the same distortions as geo-referenced images and North oriented. Generally, very little metadata related to the sensor and satellite is provided; most of the metadata concerns the geometric processing and the ellipsoid/map characteristics.

#### **3.3. Image processing**

The standard photogrammetric workflow for digital surface modeling and 3D information extraction from stereo optical images is summarized in Figure 6. The two major steps, image orientation and image matching for DEM generation, are discussed in the following sections.

**Figure 6.** Photogrammetric workflow for DEM and 3D information extraction from stereo imagery.

### **4. Image orientation**

For DEM generation, the orientation of the images is a fundamental step and its accuracy is a crucial issue in the evaluation of the entire production pipeline. Over the last three decades, various models based on different approaches have been developed [57]. Rigorous models try to describe the physical properties of the image acquisition through the relationship between the image and the ground coordinate systems [58]. In the case of images acquired by pushbroom sensors, each image line is the result of a perspective projection. The mathematical model is based on the collinearity equations, which relate the camera system (a right-handed 3D system centered on the instantaneous perspective center, with the *y* axis parallel to the image line and the *x* axis perpendicular to the *y* axis, closely aligned with the flight direction) to a 3D local coordinate system. The collinearity equations are extended to include exterior and interior orientation modeling of pushbroom sensors. Usually the ephemeris information is not precise enough for accurate mapping, and the exterior orientation has to be further improved with suitable time-dependent functions and estimated in a bundle adjustment. Simple third-order Lagrange polynomials [60][61], quadratic functions [62] and piecewise quadratic polynomials [56] have been proposed for this purpose. In [63] a study of a number of trajectory models is given. For the initial approximation of the modeling parameters, the physical properties of the satellite orbit and the ephemeris are used. With respect to the interior orientation, suitable parameters are added to model the optical design (number of lenses, viewing angles), the lens distortions and the CCD line distortions. A detailed description of the errors can be found in [64].
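For reference, the collinearity condition underlying such rigorous models can be written as follows (a generic textbook formulation with illustrative symbols, not the chapter's own notation): for a ground point $(X, Y, Z)$ imaged at time $t$, with perspective center $(X_C^t, Y_C^t, Z_C^t)$, rotation matrix elements $r_{ij}^t$ and focal length $f$,

```latex
x = -f\,\frac{r_{11}^t (X - X_C^t) + r_{21}^t (Y - Y_C^t) + r_{31}^t (Z - Z_C^t)}
             {r_{13}^t (X - X_C^t) + r_{23}^t (Y - Y_C^t) + r_{33}^t (Z - Z_C^t)}\,, \qquad
y = -f\,\frac{r_{12}^t (X - X_C^t) + r_{22}^t (Y - Y_C^t) + r_{32}^t (Z - Z_C^t)}
             {r_{13}^t (X - X_C^t) + r_{23}^t (Y - Y_C^t) + r_{33}^t (Z - Z_C^t)}
```

For a pushbroom line, $x \approx 0$ in the flight direction, and the exterior orientation parameters are functions of the acquisition time $t$; these time dependencies are what the trajectory polynomials mentioned above model.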

In recent years, 3D rational functions (RFs) have become the standard form to approximate the rigorous physical models of VHR sensors. They describe the relation between normalized image and object coordinates through ratios of polynomials, usually of the third order. The corresponding polynomial coefficients, together with scale and offset coefficients for coordinate normalization, form the so-called rational polynomial coefficients (RPCs) and are distributed by image vendors as metadata. However, to obtain a sensor orientation with sub-pixel accuracy, the RPCs need to be refined, either with linear equations requiring more accurate GCPs or, more commonly, with 2D polynomial functions ([65], [66] and others). For the latter solution, one or two GCPs are used for zero-order 2D polynomial functions (bi-directional shift), and six to ten GCPs for first- and second-order 2D polynomial functions, to compute their parameters in a least squares adjustment. In the case of stereo images, a block adjustment with RPCs is applied [66] for the image orientation. RPCs are widely adopted by image vendors and government agencies, and by almost all commercial photogrammetric workstation suppliers.
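As an illustration of how such rational functions are evaluated (a hedged sketch: the 20-term monomial ordering below is one common convention, vendors may differ, and the coefficients are placeholders rather than real RPCs):

```python
import numpy as np

# Each normalized image coordinate is a ratio of two third-order
# polynomials in the normalized object coordinates (lat P, lon L, height H).
def rpc_terms(lat, lon, h):
    """The 20 cubic monomials of normalized (lat, lon, h), one common ordering."""
    P, L, H = lat, lon, h
    return np.array([
        1, L, P, H, L*P, L*H, P*H, L*L, P*P, H*H,
        P*L*H, L**3, L*P*P, L*H*H, L*L*P, P**3, P*H*H, L*L*H, P*P*H, H**3,
    ])

def rpc_image_coord(num_coef, den_coef, lat, lon, h):
    """Normalized image coordinate = P_num / P_den (both cubic polynomials)."""
    t = rpc_terms(lat, lon, h)
    return float(np.dot(num_coef, t) / np.dot(den_coef, t))

# Placeholder coefficients: an identity-like mapping (row ~ lat), for
# illustration only -- real RPCs come from the vendor metadata.
num = np.zeros(20); num[2] = 1.0          # numerator = P (lat term)
den = np.zeros(20); den[0] = 1.0          # denominator = 1
print(rpc_image_coord(num, den, 0.25, -0.1, 0.0))
```

The bias-correcting 2D polynomials mentioned above are then applied on top of coordinates computed this way.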

#### **4.1. Image matching**

Image matching refers to the establishment of correspondences between primitives extracted from two (stereo) or more (multi-view) images. Once the image coordinates of the corresponding points are known in two or more images, the object 3D coordinates are estimated via a collinearity or projective model. In image space this process produces a *depth map* (that assigns a relative depth to each pixel of an image), while in object space the result is normally called a *point cloud*. For more than 50 years, image matching has been a subject of research, development and practical implementation in software systems. Gruen [67] reports the development of image matching techniques over the past 50 years, while in [68] more critical analyses and examples are reported.

Today many approaches for image matching exist. A first classification is based on the primitives used, which are then transformed into 3D information. According to these primitives, the matching algorithms can be divided into *area-based matching* (ABM) and *feature-based matching* (FBM) [69]. ABM, especially the LSM method with its sub-pixel capability, has a very high accuracy potential (up to 1/50 pixel) if well textured image patches are used. Disadvantages of ABM are the need for a small search range for successful matching, the large data volume which must be handled and, normally, the requirement of good initial values for the unknowns. Problems occur in areas with occlusions, lack of or repetitive texture, and if the surface does not correspond to the assumed model (e.g. planarity of the matched local surface patch). FBM is often used as an alternative to, or combined with, ABM. FBM techniques are more flexible with respect to surface discontinuities, less sensitive to image noise and require less accurate approximate values. Because of the sparse and irregularly distributed nature of the extracted features, the matching results are in general sparse point clouds which are then used as seeds to grow additional matches.
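The area-based idea can be sketched with normalized cross-correlation, a simpler similarity score than LSM used here purely for illustration: a template patch from one image is slid along a search strip in the other, and the position with the highest correlation is kept.

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equally sized patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def match_along_line(template, search_strip, step=1):
    """Slide `template` along a 2D search strip (e.g. an epipolar band);
    return the best-matching column and its NCC score."""
    h, w = template.shape
    best_col, best_score = -1, -1.0
    for c in range(0, search_strip.shape[1] - w + 1, step):
        s = ncc(template, search_strip[:h, c:c + w])
        if s > best_score:
            best_score, best_col = s, c
    return best_col, best_score
```

In a real matcher the search strip would be restricted by the epipolar geometry and the score threshold tuned; an LSM step could then refine the integer-pixel result to sub-pixel accuracy.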

Another possible way to distinguish image matching algorithms is based on the created point clouds, i.e. *sparse* or *dense* reconstructions. Sparse correspondences characterized the initial stages of the matching developments, due to computational resource limitations but also because of a desire to reconstruct scenes using only a few sparse 3D points (e.g. corners). Nowadays most algorithms focus on dense reconstructions, using stereo or multi-view approaches. A dense matching algorithm should be able to extract 3D points with a resolution sufficient to describe the object's surface and its discontinuities. Two critical issues should be considered for an optimal approach: (i) the point resolution must be adaptively tuned to preserve edges and to avoid too many points in flat areas; (ii) the reconstruction must be guaranteed also in regions with poor texture or with illumination and scale changes.

A rough surface model of the terrain is often required in order to initialize the matching procedure. Such models can be derived in different ways, e.g. from a point cloud interpolated from the tie points measured at the orientation stage, from already existing terrain models, or from a global surface model such as the 3-arc-second DEM generated by the Shuttle Radar Topography Mission (SRTM). The matching procedures for terrain modelling are generally pyramid-based: they start from a high image pyramid level, matching is performed at each pyramid level, and the results at one level are used to update the terrain range at the next lower pyramid level. In this way, the ambiguity of terrain variation is reduced at each pyramid level and the search range is reduced as well. The algorithm convergence is a function of terrain slope, accuracy, and pixel size.
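The coarse-to-fine scheme can be sketched as follows; the single-pixel absolute-difference cost is a deliberate oversimplification, used only to show how the disparity prediction propagates and restricts the search range across pyramid levels:

```python
import numpy as np

def downsample(img):
    """Halve the image by 2x2 averaging (one pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr  # pyr[0] finest, pyr[-1] coarsest

def coarse_to_fine_disparity(left, right, levels=3, radius=4):
    """Estimate a per-pixel horizontal disparity map, refining it level by
    level. At each level the search is restricted to +/- `radius` around the
    prediction propagated from the coarser level (toy per-pixel cost)."""
    pl, pr = build_pyramid(left, levels), build_pyramid(right, levels)
    disp = np.zeros(pl[-1].shape)
    for lev in range(levels - 1, -1, -1):
        L, R = pl[lev], pr[lev]
        if disp.shape != L.shape:
            # upsample the coarse prediction and double the disparity values
            disp = np.kron(disp, np.ones((2, 2)))[:L.shape[0], :L.shape[1]] * 2
        new = np.zeros_like(disp)
        for y in range(L.shape[0]):
            for x in range(L.shape[1]):
                best, bd = np.inf, 0
                d0 = int(round(disp[y, x]))
                for d in range(d0 - radius, d0 + radius + 1):
                    if 0 <= x + d < L.shape[1]:
                        c = abs(L[y, x] - R[y, x + d])
                        if c < best:
                            best, bd = c, d
                new[y, x] = bd
        disp = new
    return disp
```

A production matcher would use patch-based costs (NCC, LSM) and a terrain-range update instead of a raw disparity search, but the pyramid logic is the same.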

As dense matching is a task involving a large computing effort, the use of advanced techniques like parallel computing and implementation at GPU / FPGA level can reduce this effort and allow real-time depth map production.

The accuracy and reliability of the derived 3D coordinates depend on the accuracy of the sensor calibration and image orientation, the accuracy and number of the image observations, and the imaging geometry (i.e. the effects of camera optics, image overlap and the camera-object distance). A successful image matcher should also employ strategies to monitor the matching results and quality with statistical parameters. The weaknesses of the image matching technique for DEM generation are the requirement of good texture in the images, the absence of dark and shadowed areas, and the relatively small baseline over flying height (B/H) ratio needed to keep perspective distortions in the images very small.
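The dependence on the base-to-height ratio can be made concrete with the usual rule of thumb for the theoretical height precision of stereo intersection, sigma_Z ≈ (H/B) · sigma_p · GSD, with the parallax measurement accuracy sigma_p expressed in pixels. The numbers below are illustrative only, not taken from the cited studies:

```python
def expected_height_precision(gsd_m, b_over_h, matching_accuracy_px=0.5):
    """Rule-of-thumb height precision of stereo intersection:
    sigma_Z ~= (H/B) * sigma_p * GSD, with sigma_p in pixels."""
    return matching_accuracy_px * gsd_m / b_over_h

# e.g. a 0.5 m GSD panchromatic sensor with B/H = 0.55 and half-pixel
# matching accuracy yields roughly 0.45 m theoretical height precision
sigma_z = expected_height_precision(0.5, 0.55)
```

The formula also shows the trade-off discussed above: a small B/H ratio reduces perspective distortion and helps matching, but degrades the achievable height precision.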

# **5. DEM Quality assessment**

The following evaluation is the result of two studies conducted on DEM estimation from VHR optical sensors on spaceborne platforms, described in depth in [55], [70] and [71]. In these studies stereo scenes from three recent VHR sensors were taken into account: GeoEye-1 (GE1, launched in 2008), WorldView-2 (WV2, launched in 2009) and Pleiades-1A (PL1, launched in 2011).

In the first study DEMs were generated from stereo images acquired by GE1 over Dakar (Senegal) and Guatemala City (Guatemala) and by WV2 over Panama City, Constitucion (Chile), Kabul (Afghanistan), Teheran (Iran), Kathmandu (Nepal) and San Salvador (El Salvador). The aim of this work was to evaluate the potential of VHR satellite images for the modeling of very large urban areas and the extraction of value-added products for hazard, risk and damage assessment.

The main characteristics of the images and areas are briefly reported. Due to the large extent of the cities (up to 1'500 km<sup>2</sup>), the datasets generally consist of multiple pairs of stereo images acquired by the same sensor and cut in tiles. If the stereo pairs are acquired on the same day, the time difference between their acquisitions is less than 1 hour. In case of multiple dates, differences are up to 3 months (i.e. Dakar). The viewing angles, and consequently the convergence angle and B/H ratio, are not the same for all stereo pairs. Different situations occur: a) one quasi-nadir image (acquisition angle close to the vertical) and one off-nadir image (backward or forward viewing); b) one backward and one forward image with symmetric angles; and c) one backward and one forward image with asymmetric angles and a large convergence angle. Large viewing angles determine the presence of occlusions, mainly in urban areas, larger shadows, and a larger GSD with respect to quasi-nadir acquisitions. Areas like San Salvador and Kabul, smaller than 15-17 km in width, were scanned in one path. In case of larger areas the images were acquired on the same day from two (Panama City, Guatemala City, Constitucion) or three (Teheran and Kathmandu) different paths, or on different days (Dakar). In each path the acquisition angles are almost constant, but they differ between paths. This might cause differences in the DEM in the overlapping areas between paths, as the sensor performance for DEM generation depends on the B/H ratio, and consequently on the incidence angles of the stereo images.

GE1 stereo images were provided at geo-referenced processing level, while the WV2 ones were provided in raw level. In all cases, the rational polynomial coefficients (RPC or RPB formats) available for each image or tile were used as geo-location information for the geometric processing.

In general the geo-location accuracy of the RPC depends on the image processing level, the terrain slope and the acquisition viewing angles ([57],[58]). In flat areas like Panama City (level 2A, -16° and 16° viewing angles) the relative accuracy of the RPC between the two images composing a stereo pair is approximately 30 m, while in the mountain area around Kathmandu (level 1B, -31° and 15° viewing angles) the relative accuracy between the stereo images reaches 300 m.

The datasets mainly cover inhomogeneous urban areas with different layouts: dense areas with small buildings, downtown areas with skyscrapers, residential areas, industrial areas, forests, open areas and water (sea, lakes, and rivers). In addition the images include rural hilly and mountain areas surrounding the cities, with important height ranges: 2'400m in case of Kathmandu, almost 2'300m in case of Teheran, 1'100m in case of Guatemala City, 1'000m in case of Kabul. Some regions were not visible in the images because they were occluded by clouds or very dark cloud shadows, as in Panama City and Kathmandu.

The processing for DEM generation was applied separately to each dataset. From the orientation point of view, the geometric model for spaceborne pushbroom sensors based on RPC was used.

Ground control points were not available, so the processing aimed at guaranteeing at least the relative orientation of the images. Common tie points in two or more images were measured homogeneously across the images in order to ensure the relative orientation between the two images of the same stereo pair and between different stereo pairs that overlap along or across the flight direction, and to get a stable block. A minimum of 5 tie points per pair were measured manually by an operator on well-defined and stable features on the terrain (i.e. crossing lines, road signs). Taking into account the ground spatial resolution of the input images, the DEMs were generated with 2m grid spacing (about 4 times the pixel size of the panchromatic channels). In case of projects with overlapping stereo pairs (i.e. Teheran, Kathmandu, Dakar, etc.), the DEM was computed separately for each stereo pair and then the single DEMs were merged using linear interpolation.

The second work on DEM quality assessment was carried out on a testfield with accurate ground reference data [70]. The testfield is located in Trento, in the Northeast of Italy, and varies from urban areas with residential, industrial and commercial buildings of different sizes and heights, to agricultural or forested areas, and rocky steep surfaces, offering therefore a heterogeneous landscape in terms of geometrical complexity, land use and cover. The height ranges from 200m to 2100m. The testfield includes several heterogeneous datasets at varying spatial resolution, ranging from satellite imagery to aerial imagery, LiDAR data, GNSS surveyed points, GIS data, etc. For the scope of this Chapter, the stereo images from VHR satellite sensors used for DEM quality assessment are:

**a.** A WorldView-2 (WV2) stereo-pair, acquired on August 22, 2010. The first image was recorded in forward scanning mode with an average in-track viewing angle of 15.9°, while the second one was acquired in reverse scanning mode with an average in-track viewing angle of -14.0°. The processing level is Stereo 1B, i.e. the images are radiometrically and sensor corrected, but not projected to a plane using a map projection or datum, thus keeping the original acquisition geometry. The available channels are the panchromatic one and eight multispectral ones. The images cover an area of 17.64 × 17.64 km. The images were provided with Rational Polynomial Coefficients (RPCs).


For the geometric processing of the stereo scenes, the commercial software SAT-PP (SATellite image Precision Processing) by 4DiXplorer AG [73] was used. Information about the software functionalities and the approaches for image orientation and DEM generation is given in [74]. The images were oriented with the RPC-based model, by estimating the parameters of an affine transformation to remove systematic errors in the given RPCs. For this operation, a selection of available ground points visible in both images was used. A sub-pixel accuracy was reached in the orientation for the WV2 and GE1 stereo-pairs and the PL1 triplet. The DEMs were generated with a multi-image least-squares matching, as described in [74], using a grid spacing equal to 2 times the GSD, which leads to 1 m geometric resolution surface models. A few seed points were manually measured in the stereo images in correspondence of height discontinuities to approximate the surface. The DEMs were neither manually edited nor filtered after their generation. Table 1 shows the DEM analysis in three test areas: 1. Trento city center, characterized by small buildings close to each other; 2. Trento train station, with large flat buildings and open areas; and 3. Fersina residential area, with separated buildings and open areas. The reference LiDAR DEMs are shown together with the error maps of the GE1, WV2 and PL1 DSMs. In the error maps the height differences in the range [-1 m, 1 m] were plotted in white, as they are within the intrinsic precision of the sensors, while large positive and negative errors were highlighted in red (LiDAR above the DEM) and blue (DEM above LiDAR), respectively.
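The error maps and summary statistics of this kind of evaluation boil down to a per-pixel comparison against the reference model. A minimal sketch (a hypothetical helper, not the actual evaluation code used in the studies) could look as follows:

```python
import numpy as np

def dem_error_stats(dem, reference, precision_band=1.0):
    """Per-pixel error map (reference - dem) plus the summary statistics
    used in the evaluation (min, max, mean, std, RMSE). Differences within
    +/- precision_band metres count as within the sensors' precision
    (plotted white in the error maps)."""
    err = reference - dem
    e = err[np.isfinite(err)]                  # ignore no-data cells
    stats = {"min": e.min(), "max": e.max(), "mean": e.mean(),
             "std": e.std(), "rmse": np.sqrt((e ** 2).mean()),
             "pct_within_band": 100.0 * (np.abs(e) <= precision_band).mean()}
    return err, stats
```

Positive values of the error map correspond to the reference (LiDAR) lying above the evaluated DEM, matching the red/blue convention described above.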

By visual analysis of the DEMs, it was observed that in all datasets the surface shape was well modelled with both sensors (GE1 with processing level 2A and WV2 with processing levels 1B and 2A). In mountain areas with large elevation differences, like Trento, Teheran, Kabul and Guatemala City, the shape of valleys, mountain sides and ridges is well modeled in the DEM (Figure 7). In comparison to SRTM, using VHR images it is possible to extract a finer DSM and to filter a Digital Terrain Model (DTM) at a finer grid spacing. This is confirmed by the comparison of the height profiles of the WV2 DTM and the SRTM in Teheran (Figure 8). In rural areas, cultivated parcels can be distinguished, together with paths and lines of bushes and trees along their sides. In case of forest, the DEM clearly shows a different height with respect to adjacent cultivated areas or grass. It is even possible to distinguish roads and rivers crossing forests (Figure 9). In general rural areas are well modelled in mountain, hilly and flat terrain alike.

In urban areas building agglomerations, blocks with different heights, the road network, some infrastructures (i.e. stadiums, bridges, etc.) and rivers are generally well outlined both in flat and hilly terrain (Figure 10, Table 1). In residential and industrial areas it is possible to distinguish single buildings and lines of trees (Figure 11, Table 1). In some cases the roof structures of large and complex buildings are modeled (Figure 12). Errors are encountered between buildings, as narrow streets are not visible in the stereo-pairs due to shadows or occlusions. In these cases the DEMs overestimate the height of the terrain, as they do not model the street level (Table 1). In case of the Trento dataset (Table 1), the two large error spots (highlighted in red and blue in the LiDAR DEM) occur on churches, in all three DEMs. The matching failure is likely due to the homogeneous material used for the roof cover, the shadowing and, possibly, the structure geometry. The manual measurement of seed points on the two buildings would certainly help the matching procedure. In the train station test area, some differences are due to the presence of trees (blue circle, LiDAR is below the DEMs) and to a change in the area (red circle, LiDAR is above the DEMs), that is, a building demolished between the LiDAR and WV2/GE1/PL1 acquisitions. Regarding the third test area (Trento Fersina), significant height differences occur in correspondence of trees (blue areas) and between tall buildings.

By comparing the statistics of the error maps, in general there is a good agreement between the three surface models generated by PL1, WV2 and GE1, and the values in the statistics confirm the above analysis. The three surface models give similar results in terms of minimum and maximum values, mean value and RMSE, with Pleiades performing slightly better than the other two.

From the analyses reported above, it can be concluded that failures in surface modelling from VHR optical sensors can be caused by a number of factors, which are summarized in Table 2.

The image absolute geolocation accuracy, which depends on the viewing angle, on the processing level and on the terrain morphology, influences the estimation of the height in object space, and therefore the quality of the absolute geolocation of the final DEM and related products (DTM, orthophotos, 3D objects). The measurement of accurate and well distributed ground control points (GCPs) in the images would solve the problem, but this information is often not available. The accuracy of the relative orientation between two images forming a pair is crucial for the epipolar geometry and image matching, while the relative accuracy between overlapping stereo pairs is responsible for height steps in the final DEM (Figure 13). The size of the height steps depends on the accuracy of the geometric orientation of each stereo pair, and generally shows a systematic behavior. In both cases the relative orientation can be improved by manually measuring a sufficient number of common tie points between the images.

**Figure 7.** Trento. DEM from the GE1 stereo-pair; visualization in colour-shaded mode in SAT-PP software.

**Figure 8.** Teheran. Height profile in WV2 DTM (blue) and SRTM (red) and zoom in the corresponding surface models (above: SRTM, below: WV2 DTM).

**Figure 9.** Left: DEM of Rural area in Constitucion. Right: Original image (panchromatic)

**Figure 10.** DEM of dense urban area on flat and hilly terrain in Dakar.

**Figure 11.** DEM of residential and rural area on flat terrain in Panama City. The black oval highlights a line of trees.

**Figure 12.** Left: DEM of Panama City Airport. Right: original image (pan sharpened).

**Table 1.** DEM analysis on Trento testfield. Quality evaluation of WV2, GE1 and PL1 DEMs with respect to the LiDAR DEM: orthophoto with test areas contour (light blue) and profile transect (yellow), error planimetric distribution, statistics (minimum m, maximum M, mean μ, standard deviation σ, RMSE) and height profiles. Measures are in meters.

Low-textured and homogeneous areas cause blunders in the DEM, as the automatic matching of the homologous points fails. This is typical in homogeneous land cover (i.e. bare soil, parking lots) and shadow areas, and is caused by a combination of sun and satellite elevations and surface morphology (i.e. mountain faces). In Figure 14 building shadows, highlighted in the yellow ellipse, introduce inaccuracies in the DSM. The use of a better DEM as initial approximation can help the matching procedure in these critical areas. If a DEM is not available, so-called seed points can be measured in stereo mode in the pairs and imported in the matching procedure as mass points. In addition, an ad-hoc radiometric processing can enhance details in low-textured regions and help the matching procedure.

Occlusions are generally present in urban areas and are due to tall buildings or trees, in combination with the acquisition viewing angles. In case of occlusions, corridors between buildings are not modeled correctly (Figure 15). To overcome this drawback, multi-angular acquisitions, like those of the Pleiades constellation, can reduce occlusions in the images.

Objects moving during the acquisition of the stereo images, like vehicles, lead to small blunders, as highlighted in the blue circles in Figure 14. They can be removed with manual editing or filtering. Local blunders may also occur in correspondence with particular radiometric effects, like spilling and saturation on roof faces due to the acquisition geometry and the surface type and inclination (i.e. roof faces in grass).

**Figure 13.** Height step (mean value: 3.5m, black rectangle) between the DEMs obtained from two different stereo pairs (Dakar).

**Figure 14.** Example of effects of shadows and moving objects in the DEM (Panama City). Above: pansharpened nadir scene; below: DEM

**Figure 15.** Example of corridor occlusion due to tall buildings; (a) and (b): stereo images; (c) resulting DEM (Dakar).


**Table 2.** Summary of factors influencing the DEM quality.

# **6. DEM Fusion**

DEMs of large areas are usually generated either by SAR interferometry, laser scanning (mainly from airborne platforms) or photogrammetric processing of aerial and satellite images. As demonstrated in the previous sections, each sensing technology and processing algorithm (interferometry, image matching) shows its own strengths and weaknesses (see Table 3). Even within a single technology a highly varying DEM quality may be found.

DEMs can be generated at different coverage levels, ranging from very local areas to near-global coverage. Especially when taking into account digital models offering vast land coverage, one can see that the accuracy, error characteristics, completeness and spatial resolution offered by these products can vary widely. These products and the possibility of their joint exploitation, with their respective issues, are the object of the techniques introduced in this Section.

#### **6.1. The motivation: Complementarity**

As mentioned, each DEM generation approach and platform of origin has its advantages and drawbacks. Such characteristics point in an interesting direction, especially if one focuses on a cross-platform comparison. In Table 3 some characteristics of optical and SAR DEM generation are summarized. It can be observed that the two technologies almost perfectly complement each other.


**Table 3.** Comparison between SAR and Optical sensing technologies: advantages and drawbacks.

The complementarity between these platforms is also visible if one takes into account the respective DEM quality maps shown in Figure 1 for the SAR example and in Figure 16 for the optical one. Note that the quality scales are opposite between the two, since the optical one describes matching cross-correlation combined with other factors (the higher the better), while the SAR one defines the precision in meters (the lower the better). By analysing these examples, one can see that where the SAR result does not excel, i.e. on edges, the optical one shows better performance. Vice versa, where the optical result is not as strong, i.e. in poorly textured areas, the SAR one shows consistently better results.

**Figure 16.** Example of a matching Pléiades-1A cross-correlation map over Trento, Italy.

As a consequence, a joint exploitation of the elevation products deriving from these platforms can potentially be highly profitable, since the information on which one is weak can be replaced/completed by the one supplied by the other.

The second typology of data that seems a legitimate candidate for fusion can be defined as intra-platform, in the sense that two products coming from the same platform may as well be complementary to one another. In SAR, for instance, one may have both a DEM generated with ascending data and one with descending data, as shown in Figure 17, evidently complementing each other.

**Figure 17.** Example of intra-platform complementarity. Point cloud representation of an ascending ERS-1/2 DEM (left), a descending ERS-1/2 DEM (right) and the combination of the two (center).

In Figure 17 the digital terrain models are reproduced as point clouds (i.e. not interpolated). This is a less biased way to represent elevation data, albeit at the expense of ease of visualization and manipulation.

There are, however, constraints that must be satisfied in order to be able to compare and combine the fusion candidates. The first one, easier to solve, is dictated by the different pixel spacing. This issue is simply solved by oversampling the coarser DEM and its corresponding quality map. Note that the resampling does not imply that the data acquires the same precision as the finer one, and that the oversampling procedure implies a re-estimation of the data values. In the intra-platform case this can be avoided, as in the given example, since the products already possess comparable characteristics not requiring a re-estimate.
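Oversampling the coarser DEM onto the finer grid can be sketched with plain bilinear resampling, an illustrative choice (production tools may use other kernels); the resampled values are re-estimates and inherit the precision of the coarse input:

```python
import numpy as np

def bilinear_oversample(dem, factor):
    """Resample a coarser DEM onto a grid `factor` times denser using
    bilinear interpolation. The quality map would be resampled the same way."""
    h, w = dem.shape
    ys = np.linspace(0, h - 1, h * factor)     # target row positions
    xs = np.linspace(0, w - 1, w * factor)     # target column positions
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    # weighted combination of the four surrounding cells
    return ((1 - wy) * (1 - wx) * dem[np.ix_(y0, x0)]
            + (1 - wy) * wx * dem[np.ix_(y0, x1)]
            + wy * (1 - wx) * dem[np.ix_(y1, x0)]
            + wy * wx * dem[np.ix_(y1, x1)])
```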

The second constraint is to have a comparable quality index. As shown, the quality maps proposed in Figure 1 and Figure 16 imply different types of measures, which must be reinterpreted in order to be comparable across techniques. The way these indexes can be obtained is given in Table 4.


**Table 4.** Quality map estimation by technique.

In the following sections two different data representation levels, on which the fusion approaches are based, will be defined. Example approaches will be outlined as well.

#### **6.2. Raster level fusion**

The primary level directly available for fusion is the raster level. In fact, when represented in raster format, DEMs define a height value (when available) for each cell location on a regularly spaced grid covering the whole study area, and are a suitable input for raster-based fusion approaches.

By exploiting the quality estimates from the previous section, the easiest and sometimes most appropriate method to fuse two raster datasets is the weighted combination of the two observations, namely a weighted average. This process is executed between DEMs covering the same surface and having the same pixel spacing, resulting in a very fast cell-to-cell calculation. The simplicity of the approach, however, may badly influence the results. The most important aspect to keep in mind when fusing two different datasets is their accuracy with respect to their spatial resolution. When the latter greatly differs between the two, their combination may result in an over-smoothing effect as well as single-pixel aberrations and abrupt discontinuities (which in turn represent over-fitting). This results in a major degradation of the information contained in the datasets. When over-smoothing, the quality of the product with higher spatial resolution is degraded, while when over-fitting the spatial characteristics are extremely exaggerated. As such, this kind of processing better suits products having the same pixel spacing from the start, rather than oversampled ones, and shall be avoided if the original pixel spacing difference is too large.
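A minimal sketch of the cell-to-cell weighted average, assuming two co-registered DEMs sharing a common quality index where higher means better (the NaN fallback is an illustrative convention, not prescribed by the text):

```python
import numpy as np

def weighted_fuse(dem_a, q_a, dem_b, q_b):
    """Cell-by-cell weighted average of two co-registered DEMs with equal
    pixel spacing; q_a and q_b are quality maps rescaled to a common index
    (higher = better). Cells missing in one input (NaN) fall back to the
    other; cells missing in both stay NaN."""
    wa = np.where(np.isfinite(dem_a), q_a, 0.0)
    wb = np.where(np.isfinite(dem_b), q_b, 0.0)
    num = wa * np.nan_to_num(dem_a) + wb * np.nan_to_num(dem_b)
    den = wa + wb
    out = np.full(dem_a.shape, np.nan)
    ok = den > 0
    out[ok] = num[ok] / den[ok]
    return out
```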

More advanced methods have been proposed to overcome the issues due to an overly simplistic approach. These techniques take into account the values and statistics not only of the single pixels but also of their neighbours, by exploiting statistics windows, extending the process to a less local estimate and thus ensuring the spatial consistency of the results. This type of approach is conceptually less sensitive to the over-smoothing and over-fitting effects, since the spatial characteristics of both input datasets are taken into account. In general, during the fusion process, a higher priority is given to the input with finer spatial resolution. These approaches are more desirable than the first one based on the weighted average, since they are less prone to produce outputs greatly inferior to the inputs and, moreover, they are less influenced by oversampling, since the window on the finer input will have a higher weight. The difference in pixel spacing should ideally be kept quite small even in this case.

One example of an approach following these assumptions can be found in [75], where additional information is also used as input. The latter consists of a database (dictionary) of small DEM patches collected over existing DEMs showing characteristics similar to the ones to be fused. Sparse representations are used to solve the fusion problem: Orthogonal Matching Pursuit is used to identify the most suitable input patches to be used in the weighted linear combination defining the output. An example of a raster level fusion product is given in Figure 18.

**Figure 18.** Example of raster based DEM fusion (center) between a SPOT-5 product (right) and an ALOS-PALSAR-1 product (left).

#### **6.3. Point cloud level fusion**

The second product level available for data fusion is the point cloud level. Fusing point clouds, instead of rasterized surfaces, comes naturally if one takes into account how interferometry and image matching work, i.e. in a point-wise fashion. This lessens an effect that occurs during the rasterization step of the outputs, i.e. error propagation. By avoiding rasterization, the error value computed in the height estimation process is preserved as is, and corresponds to the input data alone, thus preserving the quality of each observation.

The approaches based on the point cloud level, however, show a greater complexity. First, point clouds are conceptually harder to handle than rasters: while the latter can easily be associated with matrices, the former are vectorial point objects in three dimensions (see Figure 19). This also implies a conceptual complexity of the approach to be designed for their fusion. The second aspect to take into account is the data size and point density. In case of SAR DEMs covering very large areas, for example, the study area may show very high coherence over the whole scene, implying a large number of points with very large height variability over a restricted two-dimensional (coordinate) space. Finally, with respect to raster-based approaches, more complex procedures must be designed for the inclusion of the information provided by the quality measures into the process.

**Figure 19.** Example of image matching results, shown as point cloud, obtained from Pléiades-1A imagery over the city of Trento.

A possible solution to the overabundance of data is a preliminary sample selection/substitution phase, in which the information about the data can be exploited. In SAR-SAR fusion, precision can be used to drive the choice between redundant points. In SAR-Optical fusion, the instrument behaviour summarized in Table 3 can be exploited as well. An ideal approach would for instance prefer measurements coming from optical edge matching rather than their interferometric counterparts; vice versa, points produced by matching a regular grid over poorly textured areas should be substituted by the interferometric ones. Note however that slope also remains a key factor to take into account (see Table 3).
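Purely as an illustration, the selection logic described above could be expressed as a toy decision rule; the record fields (`optical_type`, `texture`, `slope_deg` and the two quality indexes) are hypothetical names, and the 30° slope threshold is an arbitrary placeholder:

```python
def select_source(point):
    """Toy decision rule for the SAR-Optical point selection phase,
    following the complementarity of Table 3. `point` is a hypothetical
    record describing one location with redundant observations."""
    if point["optical_type"] in ("edge", "feature"):
        return "optical"      # matching excels on edges and features
    if point["optical_type"] == "grid" and point["texture"] == "poor":
        return "sar"          # interferometry wins on homogeneous areas
    if point["slope_deg"] > 30:
        return "optical"      # steep slopes penalize the SAR geometry
    # otherwise keep whichever observation has the better quality index
    return "sar" if point["q_sar"] > point["q_optical"] else "optical"
```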

During the image matching process it is possible to identify which primitive generated each point, and this information should be taken into account and stored, as shown in Figure 20. The selection phase can also be used as an initial combination phase, computing local statistics such as the slope between points, in order to eliminate outliers and substitute the observations according to their quality measures and characteristics.

**Figure 20.** Breakdown of a subset of Figure 19 between: grid-based matching points (left), feature-based matching points (center), edge-and area-based matching points (right).

The second step is the generation of the fused DEM itself, hence the definition of an algorithm capable of handling such data and producing the final model. Interpolators can be applied to estimate the final model from the newly defined point cloud. In this perspective, two main types of interpolators can be considered. The first type are the so-called exact interpolators [76], according to which the output estimating function must pass through the fixed input points (observations). The second type are the confidence-interval aware interpolators, for which the output function should pass within an "envelope" around the points at which the interpolant is fitted. This concept is shown in Figure 21. In the left image the output function is fitted to well selected samples. In the centre image, some samples showing a high error and lying away from the correct estimate were not filtered out, resulting in an estimated function (solid line) not reproducing the correct function (dashed line). In the right image, the outliers are still present, but thanks to the error aware interpolator, the function would ideally be better or correctly estimated. Note that ideally this error envelope should be user defined, exploiting the a priori quality knowledge offered by SAR precision, optical cross-correlation, slope and feature type, for each input point.

**Figure 21.** Example of a function estimated from an exact interpolator (left), from an exact interpolator with outliers (center) and from an error-aware interpolator (right). The solid line is the estimated function, the dashed line the correct function. Green dots are good selected samples, red dots bad selected samples, vertical lines define the error envelope.
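The envelope concept of Figure 21 can be reproduced in one dimension with a smoothing spline whose weights are the inverse of each point's a priori standard deviation. This is only an illustrative sketch on simulated data, using SciPy's `UnivariateSpline` rather than any interpolator named in the text:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
z_true = np.sin(2 * np.pi * x)          # "correct" function (dashed line)
sigma = np.full_like(x, 0.02)           # tight envelope for good samples
z = z_true + rng.normal(0.0, 0.02, x.size)

# Inject three unfiltered outliers with a wide (honest) error envelope.
bad = [12, 30, 48]
z[bad] += 1.0
sigma[bad] = 1.0

# With weights 1/sigma and s = number of points, the smoothing condition
# sum((w_i * (z_i - spl(x_i)))**2) <= s keeps the spline within roughly
# one sigma of each sample: tight on good points, loose on outliers.
spl = UnivariateSpline(x, z, w=1.0 / sigma, s=x.size)

good = np.setdiff1d(np.arange(x.size), bad)
print(np.abs(spl(x[good]) - z_true[good]).max())  # small despite outliers
```

The key point is that the outliers barely pull the estimate, because their wide envelope translates into a negligible weight, mirroring the right panel of Figure 21.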

The first family of interpolators includes, for instance, the Radial Basis Function (RBF) [77], which is widely recognized as a reliable estimator. Interpolators of this kind do not take into account the quality index of each point but, after a well-executed sample selection step, the fusion can still be performed. Ideally, interpolators taking the confidence interval into account are better suited, especially if they offer the possibility to specify the confidence range at each data point while computing the interpolant.

The size problem, however, remains. With advanced interpolators the results improve greatly in accuracy, but at higher computational cost, memory cost and complexity. A way to avoid these problems is to implement the RBF following a structure similar to that of the Shepard interpolator, as proposed in [78]. The process is summarized in Figure 22.

**Figure 22.** Diagram summarizing the steps of the proposed fusion approach.

This approach also leaves the possibility to decide the regionality and distance influence of the interpolator, allowing for fine tuning and a good trade-off between over-smoothing and overfitting. Since the datasets are very dense, the ability to control how local the interpolant should be is very important. An example of DEMs fused with this approach is shown in Figure 23.
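A minimal sketch of this idea, using SciPy's `RBFInterpolator`: its `neighbors` parameter restricts each evaluation to the k nearest observations, giving a locally supported interpolant in the spirit of the Shepard-like structure of [78]. The data below are synthetic and the parameter values are illustrative only:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(1)

# Synthetic point cloud: 2000 scattered (x, y) samples of a smooth surface.
xy = rng.uniform(0.0, 10.0, size=(2000, 2))
z = np.sin(xy[:, 0]) + 0.5 * xy[:, 1]

# 'neighbors' keeps each local linear system small (k nearest points), so
# the O(n^3) global RBF solve is avoided; 'smoothing' controls the
# over-smoothing vs overfitting trade-off discussed above.
interp = RBFInterpolator(xy, z, neighbors=50,
                         kernel='thin_plate_spline', smoothing=0.0)

# Resample onto a regular 50 x 50 raster grid (the fused DEM).
gx, gy = np.meshgrid(np.linspace(1, 9, 50), np.linspace(1, 9, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
dem = interp(grid).reshape(gx.shape)
print(dem.shape)  # (50, 50)
```

Tuning `neighbors` is exactly the "how local" knob: a small value makes the interpolant cheap and very local, a large value approaches the global RBF solution.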

**Figure 23.** Examples of an ascending ERS-1/2 DEM (bottom-left), descending ERS-1/2 DEM (bottom-right), DEM fu‐ sion results (bottom-center) and the full DEM resulting from the fusion (top), over a region in Turkey.

The second family of interpolators is composed of more advanced techniques, currently used in a wide variety of fields, including environmental studies; see [76] for an extensive collection of such approaches. When considering these techniques, however, the very large size of the datasets becomes even more of an issue, and the computational and memory costs may well become prohibitive. A possible solution could be to apply a concept similar to the one introduced for Radial Basis Functions, considerably reducing these issues.

## **7. Conclusions**

In this chapter, the SAR and Optical techniques used to produce DEMs were introduced, leading to the topic of elevation data fusion. The main reasons that make such an approach desirable, especially at an inter-platform level, have been outlined. Fusion, however, is still an open issue that has to be further investigated, and interest in it may even grow in the future, especially considering the pace at which new instruments capable of extracting elevation information are introduced, bringing with them new issues and shortcomings which may well be overcome by exploiting their complementarities.

The production of worldwide "absolute" single-date single-sensor elevation models remains utopic, even for new missions such as TanDEM-X, Sentinel or Pléiades. The shortcomings of the instruments will be lessened and their spatial resolution improved, but their intrinsic characteristics will remain, along with their complementarity and therefore the interest in their fusion. Temporal resolution is also an important aspect, since throughout the year the ground's physical characteristics can induce highly varying height estimates; the fusion of multi-date elevation models would yield products with a much more reliable reproduction of the terrain.

Each fusion processing level has its advantages: the raster level is certainly easier to handle than the point cloud one, while the latter offers a less biased data representation. For the sake of accuracy and reliability, the point cloud level is undoubtedly the better option on which to base future approaches. The dataset size-related issues affecting these approaches can be overcome by improving the sample selection or the interpolant computation step, greatly increasing the speed and manageability of the methods. Finally, approaches able to finely exploit the intrinsic information of the data to guide the fusion should be investigated.

### **Acknowledgements**

The authors would like to acknowledge Dr Giorgio Agugiaro (Bruno Kessler Foundation, Trento, Italy), Dr Emanuele Angiuli and Ivano Caravaggi (Joint Research Center, Ispra, Italy) for their support in satellite image processing, DEM extraction and quality assessment.

The work on the Trento testfield was partly supported by the 3M project (co-funded by Marie-Curie Actions FP7 – PCOFOUND – GA-2008-226070, acronym "Trentino Project"). The GeoEye-1 images acquired over Panama City (Panama) and San Salvador (El Salvador) belong to the World Bank. The authors would like to thank the World Bank for giving JRC the opportunity to use the data for research purposes, the Autonomous Province of Trento (PAT) and the Municipality of Trento for providing spatial data for the Trento testfield, and Astrium GEO-Information Services for providing the Pléiades triplet for research and investigation purposes.

# **Author details**

Loris Copa1\*, Daniela Poli<sup>2</sup> and Fabio Remondino<sup>3</sup>

\*Address all correspondence to: lcopa@sarmap.ch

1 Sarmap, Cascine di Barico, Purasca, Switzerland

2 Terra Messflug GmbH, Imst, Austria

3 Bruno Kessler Foundation, Povo-Trento, Italy

### **References**
